107 research outputs found

    A Bayesian method for evaluating and discovering disease loci associations

    Background: A genome-wide association study (GWAS) typically involves examining representative SNPs in individuals from some population. A GWAS data set can concern a million SNPs and may soon concern billions. Researchers investigate the association of each SNP individually with a disease, and it is becoming increasingly commonplace to also analyze multi-SNP associations. Techniques for handling so many hypotheses include the Bonferroni correction and recently developed Bayesian methods. These methods can encounter problems. Most importantly, they are not applicable to a complex multi-locus hypothesis which has several competing hypotheses rather than only a null hypothesis. A method that computes the posterior probability of complex hypotheses is a pressing need. Methodology/Findings: We introduce the Bayesian network posterior probability (BNPP) method, which addresses these difficulties. The method represents the relationship between a disease and SNPs using a directed acyclic graph (DAG) model, and computes the likelihood of such models using a Bayesian network scoring criterion. The posterior probability of a hypothesis is computed based on the likelihoods of all competing hypotheses. The BNPP can be used not only to evaluate a hypothesis that has previously been discovered or suspected, but also to discover new disease loci associations. The results of experiments using simulated and real data sets are presented. Our results concerning simulated data sets indicate that the BNPP exhibits better evaluation and discovery performance than a p-value based method. For the real data sets, previous findings in the literature are confirmed and additional findings are reported. Conclusions/Significance: We conclude that the BNPP resolves a pressing problem by providing a way to compute the posterior probability of complex multi-locus hypotheses. A researcher can use the BNPP to determine the expected utility of investigating a hypothesis further. Furthermore, we conclude that the BNPP is a promising method for discovering disease loci associations. © 2011 Jiang et al.
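The step of turning the likelihoods of competing hypotheses into posterior probabilities can be illustrated with a minimal sketch. This is plain Bayes' rule over a finite set of hypotheses, not the BNPP scoring itself; the hypothesis names, log-likelihood values, and priors below are hypothetical and chosen only for illustration.

```python
import math


def posterior_probabilities(log_likelihoods, priors):
    """Return P(H | D) for each competing hypothesis H via Bayes' rule."""
    # Work in log space: log P(D | H) + log P(H).
    log_joint = {h: ll + math.log(priors[h]) for h, ll in log_likelihoods.items()}
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    peak = max(log_joint.values())
    unnormalized = {h: math.exp(v - peak) for h, v in log_joint.items()}
    total = sum(unnormalized.values())
    return {h: u / total for h, u in unnormalized.items()}


# Hypothetical log-likelihoods for three competing models of a two-SNP locus.
log_likelihoods = {
    "no_association": -1050.0,
    "snp_a_only": -1046.0,
    "snp_a_and_b": -1044.0,
}
# Hypothetical priors that favor the null model.
priors = {"no_association": 0.98, "snp_a_only": 0.01, "snp_a_and_b": 0.01}

posteriors = posterior_probabilities(log_likelihoods, priors)
for hypothesis, probability in posteriors.items():
    print(f"{hypothesis}: {probability:.3f}")
```

Because the normalization runs over all competing hypotheses, a multi-locus model is weighed against every alternative, not only against the null.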

    Genome-wide association study of Alzheimer's disease

    In addition to apolipoprotein E (APOE), recent large genome-wide association studies (GWASs) have identified nine other genes/loci (CR1, BIN1, CLU, PICALM, MS4A4/MS4A6E, CD2AP, CD33, EPHA1 and ABCA7) for late-onset Alzheimer's disease (LOAD). However, the genetic effect attributable to known loci is about 50%, indicating that additional risk genes for LOAD remain to be identified. In this study, we have used a new GWAS data set from the University of Pittsburgh (1291 cases and 938 controls) to examine in detail the nine recently implicated regions for Alzheimer's disease (AD) risk, and also performed a meta-analysis utilizing the top 1% GWAS single-nucleotide polymorphisms (SNPs) with P<0.01 along with four independent data sets (2727 cases and 3336 controls) for these SNPs in an effort to identify new AD loci. The new GWAS data were generated on the Illumina Omni1-Quad chip and imputed at ∼2.5 million markers. As expected, several markers in the APOE region showed genome-wide significant associations in the Pittsburgh sample. While we observed nominally significant associations (P<0.05) either within or adjacent to five genes (PICALM, BIN1, ABCA7, MS4A4/MS4A6E and EPHA1), significant signals were observed 69–180 kb outside of the remaining four genes (CD33, CLU, CD2AP and CR1). Meta-analysis on the top 1% SNPs revealed a suggestive novel association in the PPP1R3B gene (top SNP rs3848140 with P=3.05E–07). The association of this SNP with AD risk was consistent in all five samples with a meta-analysis odds ratio of 2.43. This is a potential candidate gene for AD as it is expressed in the brain and is involved in lipid metabolism. These findings need to be confirmed in additional samples.

    Comparison of Handaxes from Bose Basin (China) and the Western Acheulean Indicates Convergence of Form, Not Cognitive Differences

    Alleged differences between Palaeolithic assemblages from eastern Asia and the west have been the focus of controversial discussion for over half a century, most famously in terms of the so-called ‘Movius Line’. Recent discussion has centered on issues of comparability between handaxes from eastern Asia and ‘Acheulean’ examples from western portions of the Old World. Here, we present a multivariate morphometric analysis in order to more fully document how Mid-Pleistocene (i.e. ∼803 Kyr) handaxes from Bose Basin, China compare to examples from the west, as well as with additional (Mode 1) cores from across the Old World. Results show that handaxes from both the western Old World and Bose are significantly different from the Mode 1 cores, suggesting a gross comparability with regard to functionally-related form. Results also demonstrate overlap between the ranges of shape variation in Acheulean handaxes and those from Bose, demonstrating that neither raw material nor cognitive factors were an absolute impediment to Bose hominins in making comparable handaxe forms to their hominin kin west of the Movius Line. However, the shapes of western handaxes are different from the Bose examples to a statistically significant degree. Moreover, the handaxe assemblages from the western Old World are all more similar to each other than any individual assemblage is to the Bose handaxes. Variation in handaxe form is also comparatively high for the Bose material, consistent with suggestions that they represent an emergent, convergent instance of handaxe technology authored by Pleistocene hominins with cognitive capacities directly comparable to those of ‘Acheulean’ hominins.

    Genome-wide association reveals genetic effects on human Aβ42 and τ protein levels in cerebrospinal fluids: a case control study

    Background: Alzheimer's disease (AD) is common and highly heritable, with many genes and gene variants associated with AD in one or more studies, including APOE ε2/ε3/ε4. However, the genetic backgrounds for normal cognition, mild cognitive impairment (MCI) and AD in terms of changes in cerebrospinal fluid (CSF) levels of Aβ1-42, T-tau, and P-tau181P have not been clearly delineated. We carried out a genome-wide association study (GWAS) in order to better define the genetic backgrounds to these three states in relation to CSF levels. Methods: Subjects were participants in the Alzheimer's Disease Neuroimaging Initiative (ADNI). The GWAS dataset consisted of 818 participants (mainly Caucasian) genotyped using the Illumina Human Genome 610 Quad BeadChips. This sample included 410 subjects (119 Normal, 115 MCI and 176 AD) with measurements of CSF Aβ1-42, T-tau, and P-tau181P levels. We used PLINK to find genetic associations with the three CSF biomarker levels. Association of each of the 498,205 SNPs was tested using additive, dominant, and general association models while considering APOE genotype and age. Finally, an effort was made to better identify relevant biochemical pathways for associated genes using the ALIGATOR software. Results: We found that there were some associations with APOE genotype although CSF levels were about the same for each subject group; CSF Aβ1-42 levels decreased with APOE gene dose for each subject group. T-tau levels tended to be higher among AD cases than among normal subjects. After adjustment using APOE genotype and age as covariates, no SNP was associated with CSF levels among AD subjects. CYP19A1 'aromatase' (rs2899472), NCAM2, and multiple SNPs located on chromosome 10 near the ARL5B gene demonstrated the strongest associations with Aβ1-42 in normal subjects. Two genes found to be near the top SNPs, CYP19A1 (rs2899472, p = 1.90 × 10⁻⁷) and NCAM2 (rs1022442, p = 2.75 × 10⁻⁷), have been reported in previous studies as genetic factors related to the progression of AD. In AD subjects, APOE ε2/ε3 and ε2/ε4 genotypes were associated with elevated T-tau levels and the ε4/ε4 genotype was associated with elevated T-tau and P-tau181P levels. Pathway analysis detected several biological pathways implicated in Normal subjects with CSF β-amyloid peptide (Aβ1-42). Conclusions: Our genome-wide association analysis identified several SNPs as important factors for CSF biomarkers. We also provide new evidence for additional candidate genetic risk factors from pathway analysis that can be tested in further studies.

    A genome-wide association study for late-onset Alzheimer's disease using DNA pooling

    Background: Late-onset Alzheimer's disease (LOAD) is an age-related neurodegenerative disease with a high prevalence that places major demands on healthcare resources in societies with increasingly aged populations. The only extensively replicable genetic risk factor for LOAD is the apolipoprotein E gene. In order to identify additional genetic risk loci we have conducted a genome-wide association (GWA) study in a large LOAD case-control sample, reducing costs through the use of DNA pooling. Methods: DNA samples were collected from 1,082 individuals with LOAD and 1,239 control subjects. Age at onset ranged from 60 to 95 and controls were matched for age (mean = 76.53 years, SD = 33), gender and ethnicity. Equimolar amounts of each DNA sample were added to either a case or control pool. The pools were genotyped using Illumina HumanHap300 and Illumina Sentrix HumanHap240S arrays testing 561,494 SNPs. 114 of our best hit SNPs from the pooling data were identified and then individually genotyped in the case-control sample used to construct the pools. Results: Highly significant association with LOAD was observed at the APOE locus, confirming the validity of the pooled genotyping approach. For 109 SNPs outside the APOE locus, we obtained uncorrected p-values ≤ 0.05 for 74 after individual genotyping. To further test these associations, we added control data from 1400 subjects from the 1958 Birth Cohort, with the evidence for association increasing to 3.4 × 10⁻⁶ for our strongest finding, rs727153. rs727153 lies 13 kb from the start of transcription of lecithin retinol acyltransferase (phosphatidylcholine-retinol O-acyltransferase, LRAT). Five of seven tag SNPs chosen to cover LRAT showed significant association with LOAD, with a SNP in intron 2 of LRAT showing the greatest evidence of association (rs201825, p-value = 6.1 × 10⁻⁷). Conclusion: We have validated the pooling method for GWA studies both by identifying the APOE locus and by observing a strong enrichment for significantly associated SNPs. We provide evidence for LRAT as a novel candidate gene for LOAD. LRAT plays a prominent role in the Vitamin A cascade, a system that has been previously implicated in LOAD.

    Age-Specific Epigenetic Drift in Late-Onset Alzheimer's Disease

    Despite an enormous research effort, most cases of late-onset Alzheimer's disease (LOAD) remain unexplained and current biomedical science is still a long way from the ultimate goal of revealing clear risk factors that can help in the diagnosis, prevention and treatment of the disease. Current theories about the development of LOAD hinge on the premise that Alzheimer's arises mainly from heritable causes. Yet, the complex, non-Mendelian disease etiology suggests that an epigenetic component could be involved. Using MALDI-TOF mass spectrometry in post-mortem brain samples and lymphocytes, we have performed an analysis of DNA methylation across 12 potential Alzheimer's susceptibility loci. In the LOAD brain samples we identified a notably age-specific epigenetic drift, supporting a potential role of epigenetic effects in the development of the disease. Additionally, we found that some genes that participate in amyloid-β processing (PSEN1, APOE) and methylation homeostasis (MTHFR, DNMT1) show a significant interindividual epigenetic variability, which may contribute to LOAD predisposition. The APOE gene was found to be of bimodal structure, with a hypomethylated CpG-poor promoter and a fully methylated 3′-CpG-island that contains the sequences for the ε4-haplotype, which is the only undisputed genetic risk factor for LOAD. Aberrant epigenetic control in this CpG-island may contribute to LOAD pathology. We propose that epigenetic drift is likely to be a substantial mechanism predisposing individuals to LOAD and contributing to the course of disease.

    Data mining of high density genomic variant data for prediction of Alzheimer's disease risk

    Background: The discovery of genetic associations is an important factor in the understanding of human illness to derive disease pathways. Identifying multiple interacting genetic mutations associated with disease remains challenging in studying the etiology of complex diseases. Although new single nucleotide polymorphisms (SNPs) at genes implicated in immune response, cholesterol/lipid metabolism, and cell membrane processes have recently been confirmed by genome-wide association studies (GWAS) to be associated with late-onset Alzheimer's disease (LOAD), a percentage of AD heritability continues to be unexplained. We try to find other genetic variants that may influence LOAD risk utilizing data mining methods. Methods: Two different approaches were devised to select SNPs associated with LOAD in a publicly available GWAS data set consisting of three cohorts. In both approaches, single-locus analysis (logistic regression) was conducted to filter the data with a less conservative p-value than the Bonferroni threshold; this resulted in a subset of SNPs used next in multi-locus analysis (random forest (RF)). In the second approach, we took into account prior biological knowledge, and performed sample stratification and linkage disequilibrium (LD) in addition to logistic regression analysis to preselect loci to input into the RF classifier construction step. Results: The first approach gave 199 SNPs mostly associated with genes in calcium signaling, cell adhesion, endocytosis, immune response, and synaptic function. These SNPs together with APOE and GAB2 SNPs formed a predictive subset for LOAD status with an average error of 9.8% using 10-fold cross validation (CV) in RF modeling. Nineteen variants in LD with ST5, TRPC1, ATG10, ANO3, NDUFA12, and NISCH respectively, genes linked directly or indirectly with neurobiology, were identified with the second approach. These variants were part of a model that included APOE and GAB2 SNPs to predict LOAD risk, which produced a 10-fold CV average error of 17.5% in the classification modeling. Conclusions: With the two proposed approaches, we identified a large subset of SNPs in genes mostly clustered around specific pathways/functions and a smaller set of SNPs, within or in proximity to five genes not previously reported, that may be relevant for the prediction/understanding of AD.
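The single-locus filtering step described in this abstract, which uses a p-value cutoff less conservative than the Bonferroni threshold, can be sketched as follows. The SNP names, p-values, number of tests, and the relaxed cutoff of 1e-3 are all hypothetical; the abstract does not state the cutoff the authors used.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def prefilter(p_values, threshold):
    """Return the SNPs whose single-locus p-value is at or below the threshold."""
    return [snp for snp, p in p_values.items() if p <= threshold]


# Hypothetical single-locus p-values (for example, from logistic regression).
p_values = {"snp_1": 2e-9, "snp_2": 3e-5, "snp_3": 0.004, "snp_4": 0.2}
n_tests = 500_000  # hypothetical number of SNPs tested genome-wide

strict = bonferroni_threshold(0.05, n_tests)
relaxed = 1e-3  # illustrative relaxed cutoff, not taken from the abstract

strict_hits = prefilter(p_values, strict)
relaxed_hits = prefilter(p_values, relaxed)
print("Bonferroni threshold:", strict)
print("Pass Bonferroni:", strict_hits)
print("Pass relaxed filter:", relaxed_hits)
```

The relaxed filter retains SNPs with weak marginal signals, which a multi-locus classifier can then evaluate jointly.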

    Meta-Analysis for Genome-Wide Association Study Identifies Multiple Variants at the BIN1 Locus Associated with Late-Onset Alzheimer's Disease

    Recent GWAS studies focused on uncovering novel genetic loci related to AD have revealed associations with variants near CLU, CR1, PICALM and BIN1. In this study, we conducted a genome-wide association study in an independent set of 1034 cases and 1186 controls using the Illumina genotyping platforms. By coupling our data with available GWAS datasets from the ADNI and GenADA, we replicated the original associations in both PICALM (rs3851179) and CR1 (rs3818361). The PICALM variant seems to be non-significant after we adjusted for APOE ε4 status. We further tested our top markers in 751 independent cases and 751 matched controls. Besides the markers close to the APOE locus, a marker (rs12989701) upstream of the BIN1 locus was replicated and the combined analysis reached genome-wide significance level (p = 5E-08). We combined our data with the published Harold et al. study, and meta-analysis with all available 6521 cases and 10360 controls at the BIN1 locus revealed two significant variants (rs12989701, p = 1.32E-10 and rs744373, p = 3.16E-10) in limited linkage disequilibrium (r² = 0.05) with each other. The independent contribution of both SNPs was supported by haplotype conditional analysis. We also conducted multivariate analysis in canonical pathways and identified a consistent signal in the downstream pathways targeted by Gleevec (P = 0.004 in Pfizer; P = 0.028 in ADNI and P = 0.04 in GenADA). We further tested variants in CLU, PICALM, BIN1 and CR1 for association with disease progression in 597 AD patients where longitudinal cognitive measures are sufficient. Both the PICALM and CLU variants showed nominally significant association with cognitive decline as measured by change in Clinical Dementia Rating-sum of boxes (CDR-SB) score from baseline but did not pass multiple-test correction. Future experiments will help us better understand potential roles of these genetic loci in AD pathology.

    Learning genetic epistasis using Bayesian network scoring criteria

    Background: Gene-gene epistatic interactions likely play an important role in the genetic basis of many common diseases. Recently, machine-learning and data mining methods have been developed for learning epistatic relationships from data. A well-known combinatorial method that has been successfully applied for detecting epistasis is Multifactor Dimensionality Reduction (MDR). Jiang et al. created a combinatorial epistasis learning method called BNMBL to learn Bayesian network (BN) epistatic models. They compared BNMBL to MDR using simulated data sets. Each of these data sets was generated from a model that associates two SNPs with a disease and includes 18 unrelated SNPs. For each data set, BNMBL and MDR were used to score all 2-SNP models, and BNMBL learned significantly more correct models. In real data sets, we ordinarily do not know the number of SNPs that influence phenotype. BNMBL may not perform as well if we also scored models containing more than two SNPs. Furthermore, a number of other BN scoring criteria have been developed. They may detect epistatic interactions even better than BNMBL. Although BNs are a promising tool for learning epistatic relationships from data, we cannot confidently use them in this domain until we determine which scoring criteria work best or even well when we try learning the correct model without knowledge of the number of SNPs in that model. Results: We evaluated the performance of 22 BN scoring criteria using 28,000 simulated data sets and a real Alzheimer's GWAS data set. Our results were surprising in that the Bayesian scoring criterion with large values of a hyperparameter called α performed best. This score performed better than other BN scoring criteria and MDR at recall using simulated data sets, at detecting the hardest-to-detect models using simulated data sets, and at substantiating previous results using the real Alzheimer's data set. Conclusions: We conclude that representing epistatic interactions using BN models and scoring them using a BN scoring criterion holds promise for identifying epistatic genetic variants in data. In particular, the Bayesian scoring criterion with large values of a hyperparameter α appears more promising than a number of alternatives.
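To make the idea of a Bayesian scoring criterion with a hyperparameter α concrete, here is a minimal sketch of the BDeu (Bayesian Dirichlet equivalent uniform) log score for a single disease node with SNP parents. It is an assumption that BDeu is the form of the criterion meant in the abstract; the function name, argument layout, and toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def log_bdeu_score(parent_rows, child_values, parent_arities, child_arity, alpha):
    """Log BDeu score of one child node given its parents.

    parent_rows    -- one tuple of parent states per sample (empty tuples if no parents)
    child_values   -- child state per sample, coded 0 .. child_arity - 1
    parent_arities -- number of states of each parent
    alpha          -- equivalent sample size hyperparameter
    """
    q = math.prod(parent_arities)      # number of parent configurations
    a_j = alpha / q                    # prior mass per parent configuration
    a_jk = alpha / (q * child_arity)   # prior mass per (configuration, child state)
    counts = Counter(zip(map(tuple, parent_rows), child_values))
    score = 0.0
    for config in product(*(range(arity) for arity in parent_arities)):
        n_jk = [counts[(config, k)] for k in range(child_arity)]
        score += math.lgamma(a_j) - math.lgamma(a_j + sum(n_jk))
        score += sum(math.lgamma(a_jk + n) - math.lgamma(a_jk) for n in n_jk)
    return score


# Toy data: genotype 0 always healthy (0), genotype 2 always diseased (1).
genotypes = [(0,)] * 10 + [(2,)] * 10
disease = [0] * 10 + [1] * 10

with_snp = log_bdeu_score(genotypes, disease, [3], 2, alpha=1.0)
without_snp = log_bdeu_score([()] * 20, disease, [], 2, alpha=1.0)
print("log score, SNP -> disease:", with_snp)
print("log score, no parents:   ", without_snp)
```

On this toy data the model with the SNP as a parent scores higher than the parentless model, as expected when the genotype fully determines disease status. Changing `alpha` shifts how strongly the prior smooths the counts, which is the hyperparameter the abstract refers to.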